Automatic Extraction of Conceptual Labels from Topic Models
نویسندگان
چکیده
În această lucrare prezentăm un sistem destinat extragerii automate de etichete conceptuale pentru topice obţinute prin metode statistice. Realizând o proiecţie a unei distribuţii peste toate cuvintele din vocabular pe ontologia WordNet reuşim asocierea de concept unor grupuri de cuvinte extrase folosind modele de topice. Contribuţiile cele mai importante ale lucrării sunt legate de validarea rolului acestor concepte ca etichete ale topicelor iniţiale şi determinarea corelaţiilor care apar între valoarea acestor etichete şi puterea relaţiei dintre concepte şi topice.
منابع مشابه
Automatic keyword extraction using Latent Dirichlet Allocation topic modeling: Similarity with golden standard and users' evaluation
Purpose: This study investigates the automatic keyword extraction from the table of contents of Persian e-books in the field of science using LDA topic modeling, evaluating their similarity with golden standard, and users' viewpoints of the model keywords. Methodology: This is a mixed text-mining research in which LDA topic modeling is used to extract keywords from the table of contents of sci...
متن کاملPrediction-Constrained Training for Semi-Supervised Mixture and Topic Models
Supervisory signals have the potential to make low-dimensional data representations, like those learned by mixture and topic models, more interpretable and useful. We propose a framework for training latent variable models that explicitly balances two goals: recovery of faithful generative explanations of high-dimensional data, and accurate prediction of associated semantic labels. Existing app...
متن کاملEvaluating Visual Representations for Topic Understanding and Their Effects on Manually Generated Topic Labels
= {Probabilistic topic models are important tools for indexing, summarizing, and analyzing large document collections by their themes. However, promoting end-user understanding of topics remains an open research problem. We compare labels generated by users given four topic visualization techniquesword lists, word lists with bars, word clouds, and network graphsagainst each other and against au...
متن کاملEvaluating Visual Representations for Topic Understanding and Their Effects on Manually Generated Labels
= {Probabilistic topic models are important tools for indexing, summarizing, and analyzing large document collections by their themes. However, promoting end-user understanding of topics remains an open research problem. We compare labels generated by users given four topic visualization techniquesword lists, word lists with bars, word clouds, and network graphsagainst each other and against au...
متن کاملUsing Musical Structure to Enhance Automatic Chord Transcription
Chord extraction from audio is a well-established music computing task, and many valid approaches have been presented in recent years that use different chord templates, smoothing techniques and musical context models. The present work shows that additional exploitation of the repetitive structure of songs can enhance chord extraction, by combining chroma information from multiple occurrences o...
متن کامل